# Small-Scale Multimodal Models
## TinyLLaVA-Phi-2-SigLIP-3.1B

- License: Apache-2.0
- Task: Image-to-Text
- Library: Transformers
- Maintainer: tinyllava
- Downloads: 4,295 · Likes: 16

TinyLLaVA-Phi-2-SigLIP-3.1B is a small-scale large multimodal model with 3.1B parameters. It combines the Phi-2 language model with the SigLIP vision encoder and outperforms some 7B-scale models.
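As a rough sketch, the model can be loaded through Transformers. Note the assumptions here: the repository ships custom modeling code, so `trust_remote_code=True` is needed, and the `chat()` helper is defined by that remote code rather than by core Transformers; the exact signature may differ from this sketch.

```python
MODEL_ID = "tinyllava/TinyLLaVA-Phi-2-SigLIP-3.1B"

def describe_image(image_path: str, prompt: str = "Describe this image.") -> str:
    """Answer a question about one image with TinyLLaVA (assumed API sketch)."""
    # Import deferred so this sketch can be defined without the model installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code pulls in the TinyLLaVA-specific model class from the repo.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)

    # chat() is provided by the repo's remote code (assumption from the model
    # card); it is reported to return the generated text and the generation time.
    answer, _seconds = model.chat(prompt=prompt, image=image_path, tokenizer=tokenizer)
    return answer
```

Downloading the 3.1B checkpoint on first use takes several gigabytes of disk and a GPU is recommended for reasonable latency.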
## TinyLLaVA-3.1B

- License: Apache-2.0
- Task: Image-to-Text
- Library: Transformers (supports multiple languages)
- Maintainer: bczhou
- Downloads: 184 · Likes: 26

TinyLLaVA is a framework for small-scale large multimodal models that significantly reduces parameter count while maintaining high performance. The 3.1B version outperforms comparable 7B-scale models on multiple benchmarks.